-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Amazon Ads: changes for increasing timeout for long taking report generation #10513
Amazon Ads: changes for increasing timeout for long taking report generation #10513
Conversation
Hi, @harshithmullapudi, |
...c/main/resources/config/STANDARD_SOURCE_DEFINITION/c6b0a29e-1da9-4512-9002-7bfd0cba2246.json
Outdated
Show resolved
Hide resolved
@@ -153,7 +153,7 @@ def read_records( | |||
@backoff.on_exception( | |||
backoff.expo, | |||
ReportGenerationFailure, | |||
max_tries=5, | |||
max_tries=1000, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These value increasing is not ideal (5 to 1000 and 30 seconds to 120 seconds). It's possible to find an optimal way to handle those situations? @misteryeo this is the case where we should add options to users have a better control in connector config?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have added the timeout to 120 minutes and 1000 iterations because sometimes for our use case it takes more than 30 minutes (To be on the save side, we kept it 120 minutes). The same thing goes for max_tries to 1000.
It will be better if we can give these values user-defined @marcosmarxm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @marcosmarxm @harshithmullapudi ,
We have added an optional parameter for the user to enter report_wait_timeout. By default, it's 30 minutes, but if the user enters any time greater than 30 minutes, then that timing will be used.
Regarding max_tries, the majority of our time, we get reports at around 1000 retries. Thus we proposed to increase the value by 1000.
Since we are currently using this connector in our production, we request you to kindly merge this PR as soon as possible.
Hey can you run
Also that max retries can we move that into configuration rather than hardcoding it. For companies which are having less customers it would be consuming too many API requests |
For this type of UX change in spec I recommend to have a talk with @misteryeo and validate with connector team the best sollution. |
Hi @misteryeo, |
Hi folks, sorry for the delay here. Agreed with @harshithmullapudi and @marcosmarxm that we don't want to be hard coding these values as not everyone requires such high values. Let's move forward with adding both as options for users to adjust but have the original values as default. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments! Almost there @aakashkumarmaple !!!
airbyte-integrations/connectors/source-amazon-ads/source_amazon_ads/spec.py
Outdated
Show resolved
Hide resolved
LABEL io.airbyte.version=0.1.3 | ||
LABEL io.airbyte.name=airbyte/source-amazon-ads | ||
LABEL io.airbyte.version=0.1.4 | ||
LABEL io.airbyte.name=airbyte/source-amazon-ads |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eof
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @marcosmarxm , can you elaborate.
...ions/connectors/source-amazon-ads/source_amazon_ads/streams/report_streams/report_streams.py
Outdated
Show resolved
Hide resolved
@aakashkumarmaple do you want to continue this work? |
Hi @marcosmarxm, sorry for the delay. I will make the necessary changes and update you asap |
@marcosmarxm you want me to take over this? |
Let me review it @harshithmullapudi |
…te into marcos/test-pr-10513
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @aakashkumarmaple I'll publish the new version tomorrow.
What
Describe what the change is solving
Since Rate limits associated with the Sponsored Products reports API and the Sponsored Brands reports API depend on the size of the report generation queue, therefore for few report generation, it takes more than 30 minutes for reports to get generated.
Thus increasing the REPORT_WAIT_TIMEOUT to 120 minutes from the previous 30 minutes may solve a problem where the speed is affected by choice of region, utilization period, and reporting queue time.
Please Refer Document (Rate Limiting)[https://advertising.amazon.com/API/docs/en-us/concepts/rate-limiting#rate-limiting] for more information on Amazon Ads rate limiting conditions.
"""
Rate limits associated with the [Sponsored Products reports API](https://advertising.amazon.com/API/docs/en-us/sponsored-products/2-0/openapi#/Reports) and the [Sponsored Brands reports API](https://advertising.amazon.com/API/docs/en-us/sponsored-brands/2/reports) depends on the size of the report generation queue. These limits are determined on a per-region basis with multiple limit tiers that correspond to the current reporting queue size in a given region. Higher utilization periods with higher report queue sizes result in lower rate limits. Lower utilization periods result in higher rate limits with a higher number of accepted report generation requests.
"""
How
Upon changing the REPORT_WAIT_TIMEOUT to 120 minutes, the connector waits for more time for getting the data rather than giving up after 30 minutes. For this, we have made modifications in report_stream. Also, for maximum retries, we have also increased max_tried for _init_and_try_read_records for the same reason
Recommended reading order
report_stream.python
🚨 User Impact 🚨
No Breaking Changes